home *** CD-ROM | disk | FTP | other *** search
- \section*{Resolving trailing dots\ldots}
- Gregory Tucker-Kellogg\\
- gtk@walsh.med.harvard.edu
- \subsection*{Introduction}
- Unlike \verb|WEB|, \verb|noweb| does not allow the use of trailing
- dots in chunk (section) names. \verb|Dots| corrects for this. It is
- similar but not identical to \verb|disambiguate|, an \verb|Icon|
- program to accomplish the same task. \verb|Dots| is written in
- \verb|perl|.
- Before it does much else, \verb|noweb| creates a markup description of
- a source file. That markup description is passed along (in both
- \verb|noweave| and \verb|notangle|) to other programs in the pipeline
- (\verb|totex| for \verb|noweave| and \verb|nt| for \verb|notangle|).
- \verb|Dots| intervenes after the markup stage as a filter. The chunk
- name references are passed in the form described in Ramsey's paper,
- i.e.,
- \begin{quote}
- \leavevmode\rlap{\begin{tabular}{ll}
- \tt @defn {\rm\it name}&The code chunk named {\rm\it name} is being defined\\
- \tt @use {\rm\it name}&A reference to code chunk named {\rm\it name}\\
- \end{tabular}}
- \end{quote}
- If trailing dots are used in a chunk name, they will be passed along
- at the markup stage verbatim without any attempt at resolution.
- That's where \verb|dots| comes in.
- We require two passes over the noweb code as passed through
- \verb|markup|. The first pass picks out all of the unambigious chunk
- names and stores them in associative arrays. In between the passes,
- we expand the ambigious names and do some simple error checking. The
- second pass does a simple replace on incomplete names and writes
- output to the next stage of the pipeline.
- The choices for handling the input stream seems to be between sucking
- the whole markup into memory at once (as \verb|disambiguate| does) or storing
- the markup as a temporary file between the passes. The second is
- slower but will not break as the file gets bigger. We'll choose the
- first for now.
- \subsection*{Program outline}
- <<*>>=
- #!/usr/local/bin/perl
- while (<>) { # the first pass takes the input from STDIN
- <<create lists of identifiers>>
- <<resolve ambiguities in identifier names>>
- <<printout while replacing those with trailing dots>>
- \subsection*{Representation}
- What's the best structure for the list of chunk names? It could just be a
- normal array, except we would have to check if a given name is already
- defined before adding it too the list. We could make an associative
- array, except we really don't have a key to associate. On the other
- hand, we could make a single associative array of names with
- associations ``complete'' and ``incomplete'' depending on the
- presence of dots. This would require no checking on predefinitions,
- and a key sorted list brings up each full chunk name as the {\em next}
- member of the list for which [[$completion{$identifier}=$complete]].
- <<create lists of identifiers>>=
- if (/^@(defn|use)\s(.*)$/) { # we've found a name of some sort
- if (($truncated = $2) =~ s/\.\.\.$//) { # this one ends in dots.
- $completion{$truncated} ="incomplete";
- $truncations{$.-1} = $truncated;
- $usage_type{$.-1} = $1;
- else {$completion{$2} ="complete";}
- push(lines,$_);
- \subsection*{Chunkname resolution}
- The associative array [[%completion]] contains all of the names. The
- associative array [[%truncation_table]] contains the line numbers of the
- names with trailing dots. We can change the values of [[%completion]]
- from ``complete'' and ``incomplete'' to a number representing the
- index of the appropriate completion. If there is more than one, we
- can print out a warning but still resolve on the closest name.
- <<resolve ambiguities in identifier names>>=
- @namelist = sort(keys(%completion));
- $j = $i = 0;
- while ($i < $#namelist) { #collect all the ambiguities in a row
- while ($completion{$namelist[$j]} eq "incomplete") {
- $ambiguity_found = 1;
- $j = $i + 1;
- <<check for remaining ambiguity>>
- foreach $name (@namelist[$i..$j]) {
- $completion{$name} = $namelist[$j];
- $j=$i=$j + 1;
- undef($ambiguity_found);
- After we've gotten the expansions of abbreviated chunk names, we still
- might run into a problem. First, if no correct expansion was
- established, we might just missassign the abbreviation. The expansion
- might still be ambiguous if more than one complete expansion can give
- the same abbreviation. The first case is a fatal error. The second
- can be resolved by seeing if a complete chunkname immediately
- following the first completion is a solution. If so, we take the
- first completion anyway but print a warning for the user.
- <<check for remaining ambiguity>>=
- if (defined $ambiguity_found) {
- $suggested = $namelist[$j];
- $nextchance = $namelist[$j+1];
- foreach $name (@namelist[$i..$j-1]) {
- if (substr($suggested,0,length($name)) ne $name) {
- die "FATAL ERROR: can't resolve @<<$name...>>\n"
- }
- if ($completion{$nextchance} eq "complete") {
- foreach $name (@namelist[$i..$j-1]) {
- if (substr($nextchance,0,length($name)) eq $name) {
- print STDERR "WARNING--Ambiguous chunkname:\n";
- print STDERR "\t<<${name}...@>> could be either\n";
- print STDERR "\t<<$suggested@>> or\n\t<<$nextchance@>>\n";
- print STDERR "I will use <<$suggested@>>\n"
- }
- }
- \subsection*{Printout}
- Finally, the [[%truncations]] and [[%usage_type]] arrays are put to
- work. We use the line numbers (as [[keys()]]) to pull up the
- truncations, and then associate truncations with completed names.
- Since we found everything on the first pass we don't have to scan
- each line for a [[@defn]] or [[@use]] statement. Note: this part of
- the program, analogous to {\em pass2} in \verb|disambiguate|, is
- different from \verb|disambiguiate|, which went through a search on
- the second pass. If we decided to store the markup in a temporary
- file after the first pass to save memory, we would change this section
- for blockwise printout. We still would not be forced to scan each
- line.
- <<printout while replacing those with trailing dots>>=
- foreach $trunc_line (sort(keys(%truncations))) {
- $lines[$trunc_line] =
- "\@$usage_type{$trunc_line} $completion{$truncations{$trunc_line}}\n";
- print @lines;
-